- 2 - Van Hook et al.

Appl. No. 09/662,832 

Amendments to the Claims 

The listing of claims will replace all prior versions, and listings, of claims in the
application.

1. (currently amended) [[In a]] A microprocessor [[computer system including a
processor]] having a plurality of registers, [[a method for generating an aligned vector of
first width from two second width vectors for single instruction multiple data (SIMD)
processing,]] comprising [[the steps of]]:

a first processing unit that

in response to a first load instruction loads [[loading]] a first [[vector]] plurality
of data bytes from a memory unit into a first register, [[wherein the first vector contains a
first byte of the aligned vector to be generated;]]

in response to a second load instruction loads [[loading]] a second [[vector]]
plurality of data bytes from the memory unit into a second register, and [[;]]

in response to an alignment instruction determines [[determining]] a starting
data byte in the first register, wherein the starting data byte specifies [[the]] a first data
byte of [[the]] an aligned vector, [[;]] extracts [[extracting]] the aligned vector from the
first register and the second register beginning from [[the]] a first bit in the starting byte
of the first register continuing through bits in the second register, [[;]] and replicates
[[replicating]] the aligned vector into a third register such that the third register contains
a plurality of data elements aligned for [[SIMD]] single instruction multiple data (SIMD)
processing; and

a SIMD processing unit coupled to the first processing unit that operates on the
aligned vector.
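
For illustration only, and not as part of the claims, the alignment behavior recited in claim 1 can be sketched in Python. The function name `align`, the 8-byte register width, and the numbering of bytes from index 0 are assumptions made for this sketch; the claim itself does not fix any of them.

```python
def align(first_reg: bytes, second_reg: bytes, start: int) -> bytes:
    """Hypothetical model of the alignment step.

    first_reg and second_reg stand for the two loaded registers, and
    start is the index of the starting data byte in first_reg. The
    result is one register wide.
    """
    width = len(first_reg)
    if len(second_reg) != width:
        raise ValueError("registers must have equal width")
    if not 0 <= start < width:
        raise ValueError("start must index a byte of the first register")
    # Treat the two registers as one byte string, then take `width`
    # bytes beginning at the starting data byte.
    combined = first_reg + second_reg
    return combined[start:start + width]


# Two 8-byte loads; the wanted vector begins at byte 3 of the first load.
memory = bytes(range(16))
first, second = memory[0:8], memory[8:16]
third = align(first, second, 3)
```

Under these assumptions, `third` holds bytes 3 through 10 of `memory`, that is, the tail of the first register followed by the head of the second.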

2-48. (canceled)

49. (currently amended) A microprocessor having a plurality of registers, [[method for
generating an aligned vector from two source vectors for single instruction multiple data
(SIMD) processing,]] comprising [[the steps of]]:

a first processing unit that

[[(1) loading]] in response to a first load instruction loads a first [[source
vector]] plurality of data bytes into a first register, [[;]]

[[(2) loading]] in response to a second load instruction loads a second
[[source vector]] plurality of data bytes into a second register, and [[;]]

[[(3) reading]] in response to a shuffle instruction reads a first plurality of
data elements from [[said]] the first register and a second plurality of data elements from
[[said]] the second register, [[;]] and

[[(4) writing said]] writes the first plurality of data elements and [[said]] the
second plurality of data elements into a third register in a particular order specified by
the shuffle instruction to produce a [[target]] vector having a plurality of data elements
aligned for [[SIMD]] single instruction multiple data (SIMD) processing; and

a SIMD processing unit coupled to the first processing unit that operates on the
vector.
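
For illustration only, and not as part of the claims, the shuffle behavior recited in claim 49 can be sketched in Python. The function name `shuffle` and the encoding of the "particular order" as a list of (source, index) pairs are assumptions made for this sketch; the claim only requires that the order be specified by the shuffle instruction.

```python
def shuffle(first_reg, second_reg, order):
    """Hypothetical model of the shuffle step.

    order is a sequence of (source, index) pairs, where source 0 means
    the first register and source 1 means the second register. The
    returned list stands for the third register.
    """
    sources = (first_reg, second_reg)
    return [sources[src][idx] for src, idx in order]


# Interleave the first two elements of each source register.
a = [10, 11, 12, 13]
b = [20, 21, 22, 23]
interleaved = shuffle(a, b, [(0, 0), (1, 0), (0, 1), (1, 1)])
```

Here `interleaved` is `[10, 20, 11, 21]`: elements are read from both registers and written in the order the selector gives.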

50. (currently amended) The [[method as recited in]] microprocessor of claim 49,
wherein [[said writing step comprises:]] in response to the shuffle instruction, the first
processing unit

[[writing]] writes [[even numbered, lower]] zero-extended data elements of [[said]] the
first register to [[said]] the third register [[; and

writing sign bits of odd numbered, lower elements of said first register to said
third register]].

51. (currently amended) The [[method as recited in]] microprocessor of claim 49,
wherein [[said writing step comprises:]] in response to the shuffle instruction, the first
processing unit

[[writing even numbered, upper elements of said first register to said third register;
and

writing]] writes [[sign bits of odd numbered, upper]] sign-extended data elements of
[[said]] the first register to [[said]] the third register.
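
For illustration only, and not as part of the claims, the zero-extension and sign-extension recited in claims 50 and 51 can be sketched in Python. The function names and the 8-bit source and 16-bit destination element widths are assumptions made for this sketch.

```python
def zero_extend(elements, bits=8):
    """Widen each element by filling the new upper bits with zeros."""
    mask = (1 << bits) - 1
    return [e & mask for e in elements]


def sign_extend(elements, bits=8, out_bits=16):
    """Widen each element by copying its sign bit into the new upper bits."""
    low_mask = (1 << bits) - 1
    high_fill = ((1 << out_bits) - 1) & ~low_mask
    out = []
    for e in elements:
        e &= low_mask
        if e & (1 << (bits - 1)):  # sign bit set: fill the upper bits with ones
            e |= high_fill
        out.append(e)
    return out


first_reg = [0x7F, 0x80, 0xFF]
zero_ext = zero_extend(first_reg)
sign_ext = sign_extend(first_reg)
```

With these inputs, `zero_ext` is `[0x007F, 0x0080, 0x00FF]` and `sign_ext` is `[0x007F, 0xFF80, 0xFFFF]`; the two results differ only for elements whose top bit is set.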

52. (currently amended) [[A method for generating an ordered set of elements in a
target vector from elements in a first source vector and a second source vector for single
instruction multiple data (SIMD) vector processing]] The microprocessor of claim 49,
[[comprising the steps of:]] wherein in response to the shuffle instruction, the first
processing unit

[[(1) loading the first source vector into a first register;]]

[[(2) loading the second source vector into a second register;]]

[[(3) selecting]] reads a first subset of data elements from [[said]] the first register,
[[said]] the first subset of data elements comprising any one of the following groups of data
elements from the first [[source vector]] register: odd elements, even elements, lower
elements and upper elements; and

[[(4) selecting]] reads a second subset of data elements from [[said]] the second
register, [[said]] the second subset of data elements comprising any one of the following
groups of data elements from the second [[source vector]] register: odd elements, even
elements, lower elements and upper elements.
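
For illustration only, and not as part of the claims, the four element groups named in claim 52 can be sketched in Python. The function name `select_subset`, the treatment of element 0 as an even element, and the reading of "lower" as the low-index half of the register are assumptions made for this sketch; the claim does not define the numbering.

```python
def select_subset(register, group):
    """Return one of the four hypothetical element groups of a register."""
    half = len(register) // 2
    if group == "even":
        return register[0::2]   # elements 0, 2, 4, ...
    if group == "odd":
        return register[1::2]   # elements 1, 3, 5, ...
    if group == "lower":
        return register[:half]  # low-index half
    if group == "upper":
        return register[half:]  # high-index half
    raise ValueError("unknown group: " + group)


reg = [0, 1, 2, 3, 4, 5, 6, 7]
evens = select_subset(reg, "even")
uppers = select_subset(reg, "upper")
```

Under these assumptions, `evens` is `[0, 2, 4, 6]` and `uppers` is `[4, 5, 6, 7]`.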

53. (currently amended) The [[method]] microprocessor of claim 52, [[further comprising
the step of:]] wherein in response to the shuffle instruction, the first processing unit

[[writing said]] writes the first subset of data elements and [[said]] the second subset of
data elements into [[a]] the third register [[to facilitate a particular SIMD vector
processing operation, said first subset being written into any one of the following groups
of elements in said third register: upper elements, odd elements, and odd elements in
reverse order, and said second subset being written into any one of the following groups
of elements in said third register: lower elements, even elements, and even elements in
reverse order, wherein elements written into said third register comprise the target
vector]].

54. (canceled) 

55. (currently amended) The [[method as recited in]] microprocessor of claim 49 [[54]],
wherein the vector in the third register [[is comprised of]] comprises a plurality of
[[eight]] 8-bit data elements.

56. (currently amended) The [[method as recited in]] microprocessor of claim 49 [[54]],
wherein the vector in the third register [[is comprised of four]] comprises a plurality of
16-bit data elements.

57. (currently amended) The [[method as recited in]] microprocessor of claim 1,
wherein the starting data byte is specified [[as]] by a variable in a [[register in an]] field
of the alignment instruction.

58. (currently amended) The [[method as recited in]] microprocessor of claim 1,
wherein the first [[vector]] plurality of data bytes and the second [[vector]] plurality of
data bytes are [[in]] loaded from contiguous locations [[in the]] of a memory unit.

59. (currently amended) The [[method as recited in]] microprocessor of claim 1,
wherein the [[processor]] first processing unit operates in a big-endian byte ordering mode.

60. (currently amended) The [[method as recited in]] microprocessor of claim 1,
wherein the [[processor]] first processing unit operates in a little-endian byte ordering
mode.

61. (canceled) 

62. (canceled)

63. (canceled) 

64. (canceled) 

65. (new) A microprocessor having a plurality of registers, comprising: 
a first processing unit that 

in response to an alignment instruction determines a starting data byte in a 
first register specified by the alignment instruction, wherein the starting data byte 
specifies a first data byte of a first vector, extracts the first vector from the first register 
specified by the alignment instruction and a second register specified by the alignment 
instruction beginning from a first bit in the starting byte of the first register specified by 
the alignment instruction continuing through bits in the second register specified by the 
alignment instruction, and replicates the first vector into a third register specified by the 
alignment instruction such that the third register specified by the alignment instruction 
contains a plurality of data elements aligned for single instruction multiple data (SIMD) 
processing; and 

in response to a shuffle instruction reads a first plurality of data elements
from a first register specified by the shuffle instruction and a second plurality of data
elements from a second register specified by the shuffle instruction, and writes the first
plurality of data elements and the second plurality of data elements into a third register
specified by the shuffle instruction, in a particular order specified by the shuffle
instruction, to produce a second vector aligned for SIMD processing; and

a SIMD processing unit coupled to the first processing unit that operates on
vectors aligned for SIMD processing.

66. (new) The microprocessor of claim 65, wherein in response to the shuffle 
instruction, the first processing unit writes zero-extended data elements of the first 
register to the third register. 

67. (new) The microprocessor of claim 65, wherein in response to the shuffle 
instruction, the first processing unit writes sign-extended data elements of the first 
register to the third register. 

68. (new) The microprocessor of claim 65, wherein the starting data byte is specified 
by a variable in a field of the alignment instruction. 

69. (new) The microprocessor of claim 65, wherein the first processing unit operates 
in a big-endian byte ordering mode. 

70. (new) The microprocessor of claim 65, wherein the first processing unit operates
in a little-endian byte ordering mode.